feat(integrations): slice 3 — daily Datadog sync + cron + UI surfacing (#15)#40
Merged
trentas added a commit that referenced this pull request on May 13, 2026:

Bumps platform/package.json (1.0.0 → 1.0.6, catching up from the initial scaffold), pyproject.toml (1.0.5 → 1.0.6), and iris/cli.py:VERSION (v1.0.5 → v1.0.6). Adds the CHANGELOG entry for v1.0.6 covering the Datadog integration end-to-end across slices 1-5 (PRs #36, #37, #39, #40, #41, #42).

Highlights:
- Connect flow + encrypted credentials (slice 2)
- Daily Vercel Cron sync into external_deployments / _commits / _incidents (slice 3)
- Engine consumes events and emits 18 new dora_* fields including CFR / MTTR per-deploy / MTTR per-incident / rollback rate / lead time / deploy frequency / by-origin breakdowns (slice 4 + 5)
- Dashboard DORA section with the "Datadog" badge and the AI-vs-human correlation card (slice 5)
- Setup docs at docs/integrations/datadog.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ts + incidents (#15)

Adds the three tables that the daily Datadog sync writes into:
- external_deployments: DORA deployment events (one row per Datadog event id). Tri-state change_failure (TRUE | FALSE | NULL = pending evaluation), recovery_time_sec and remediation_* for per-deploy MTTR, and dd_repository_id retained for debuggability when repo matching fails. UNIQUE (provider, provider_event_id) for idempotent upsert.
- external_deployment_commits: per-commit detail unpacked from attributes.commits[]. This is the join key for the AI-vs-human CFR correlation in slice 5 (commit_sha ↔ commit_origin.commit_sha).
- external_incidents: DORA failure events. service / env / team are TEXT[] because Datadog returns them as arrays; GIN index on service for the per-service incident queries.

Schema and field choices match the production probe in docs/PLAN-datadog.md §9.2 (sampled 500 deploys / 5 failures on a real tenant); no RLS, consistent with the rest of the iris-specific tables.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
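To illustrate why the UNIQUE (provider, provider_event_id) constraint makes reruns safe, here is a minimal in-memory sketch of an upsert keyed on that pair. It is not the project's code: the row shape, the camelCase field names, and the helpers `conflictKey` and `upsertDeployments` are assumptions made for illustration, and the real sync presumably lets Postgres resolve the conflict.

```typescript
// Hypothetical row shape; only a few of the columns named above are modeled.
interface ExternalDeployment {
  provider: string;
  providerEventId: string;
  startedAt: string;
  // Tri-state: true = failed, false = healthy, null = pending evaluation.
  changeFailure: boolean | null;
}

interface UpsertResult {
  inserted: number;
  updated: number;
  unchanged: number;
}

// Mirrors the composite unique key (provider, provider_event_id).
function conflictKey(row: ExternalDeployment): string {
  return JSON.stringify([row.provider, row.providerEventId]);
}

// Insert new rows, overwrite rows whose values changed, and leave
// identical rows alone, so replaying a batch produces no net change.
function upsertDeployments(
  table: Map<string, ExternalDeployment>,
  rows: ExternalDeployment[],
): UpsertResult {
  const result: UpsertResult = { inserted: 0, updated: 0, unchanged: 0 };
  for (const row of rows) {
    const key = conflictKey(row);
    const existing = table.get(key);
    if (existing === undefined) {
      table.set(key, { ...row });
      result.inserted++;
    } else if (
      existing.startedAt !== row.startedAt ||
      existing.changeFailure !== row.changeFailure
    ) {
      table.set(key, { ...row });
      result.updated++;
    } else {
      result.unchanged++;
    }
  }
  return result;
}
```

Replaying the same batch only increments `unchanged`, while a later run that resolves a pending `changeFailure` from null to a boolean counts as an update to the existing row rather than a new row.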
#15)

Builds the daily ingestion pipeline on top of the slice 3 migrations:
- platform/lib/integrations/datadog/client.ts: adds listDeployments and listFailures wrappers around POST /api/v2/dora/{deployments,failures}. Time-slicing pagination is implemented in the caller; the client just shapes the request body and surfaces structured errors.
- platform/lib/integrations/datadog/sync.ts: the per-org sync. Cursor by last_sync_at with a 30-day default backfill, idempotent upsert by (provider, provider_event_id), per-commit join table populated from attributes.commits[], DD repository_id ↔ repositories.remote_url matched via normalized slug, anti-spin guard for the §9.5 boundary case. Status / last_error / last_sync_at flip on the org_integrations row at the end of each run.
- platform/src/app/api/cron/sync-integrations/route.ts: Vercel Cron handler. Auth-gated by CRON_SECRET (Bearer or x-cron-secret header); in non-production with no secret configured, the route is open so `npm run dev` can hit it directly. Iterates active integrations and fans out per-provider sync sequentially within the 300s budget.
- platform/vercel.json: registers the cron at 0 4 * * * UTC.
- env.example: documents CRON_SECRET and how to generate it.
- platform/src/app/[tenant]/settings/integrations/[provider]/page.tsx + datadog-connect-form.tsx: surfaces the unmatched-deployments count alongside last_sync_at / last_error / connected_at on the detail page so customers can spot repo-mapping drift early.
- platform/tests/datadog-sync.test.ts: covers slug normalization, pagination-helper edge cases, and the Datadog zero-value timestamp guard observed in pull_requests[] during the §9.2 probe.

Engine integration (dora_real.py) and the dashboard surfacing land in slices 4 and 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
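The repository matching described above depends on reducing different remote-URL spellings to one comparable slug. The sketch below shows one way that could work. It is an assumption-laden illustration, not the contents of sync.ts: the output form (`host/org/repo`, lowercased) and the decision to return null for unparseable input are guesses, and only the function name `normalizeRepoSlug` comes from the PR.

```typescript
// Hypothetical sketch; the real normalizeRepoSlug may produce a different shape.
function normalizeRepoSlug(remote: string): string | null {
  let s = remote.trim().toLowerCase();
  if (s === "") return null;
  // Drop a URL scheme such as https://, ssh:// or git://.
  s = s.replace(/^[a-z][a-z0-9+.-]*:\/\//, "");
  // Drop userinfo such as "git@" (only when it precedes the first slash).
  s = s.replace(/^[^/@]*@/, "");
  // scp-like "host:org/repo" becomes "host/org/repo", unless the colon
  // introduces a numeric port (a purely numeric org would be misread as one).
  s = s.replace(/^([^/:]+):(?!\d+(\/|$))/, "$1/");
  // Drop an explicit port, as in "host:22/org/repo".
  s = s.replace(/^([^/:]+):\d+(?=\/|$)/, "$1");
  // Drop a leading "www.", trailing slashes, and a trailing ".git".
  s = s.replace(/^www\./, "");
  s = s.replace(/\/+$/, "");
  s = s.replace(/\.git$/, "");
  s = s.replace(/\/+$/, "");
  // A usable slug needs at least host and path.
  return s.includes("/") ? s : null;
}
```

Under these assumptions, `git@github.com:Acme/Repo.git`, `ssh://git@github.com/acme/repo.git`, and `https://www.github.com/acme/repo/` all reduce to the same string, which is what lets a Datadog repository id be compared against a stored `remote_url`.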
Force-pushed from 5a87157 to 8a147f0.
trentas added a commit that referenced this pull request on May 13, 2026:

#15) (#42)

* feat(dashboard): DORA section + CFR-by-origin correlation + setup docs (#15)

Slice 5: closes the Datadog integration loop. The dashboard now surfaces the dora_* metric family with a "Datadog" badge, an AI-vs-human CFR correlation card backed by a per-commit join, and a silent-decay guard on the integration detail page. Final piece: customer-facing setup documentation.

Engine (Python):
- iris/analysis/dora_real.py: new ``cfr_by_origin`` / ``rollback_rate_by_origin`` breakdowns when the aggregator passes the local commit-origin map. Per-commit join: each commit on each evaluated deploy is bucketed by its origin; commits not present in the local window are dropped silently but reflected in coverage_pct so the dashboard can warn when attribution is thin.
- iris/metrics/aggregator.py: passes ``origin_map`` through to ``analyze_dora_real``.
- iris/models/metrics.py: adds ``dora_cfr_by_origin`` and ``dora_rollback_rate_by_origin``.
- tests/test_dora_real.py: 4 new tests covering the per-commit join, unknown-commit handling with coverage reporting, rollback filtering, and the no-origin-map default.

Platform:
- src/types/metrics.ts: TS mirrors of the two new dora_* fields.
- src/types/org-summary.ts: new OrgDORA aggregation type.
- lib/queries/org-summary.ts: computeDORA() sums deploys / failures / rollbacks across repos, weights CFR by evaluated deploys, and aggregates the by-origin breakdown. Returns null when no repo has an active integration.
- src/app/[tenant]/dashboard/sections/DORAOverview.tsx: headline cards (CFR, MTTR per failed deploy, deploy frequency, lead time) plus a "Datadog" badge, a fact strip (deploys / rollback rate / pending), and the CFR-by-origin + rollback-rate-by-origin correlation tables. The correlation card stays hidden until the org has ≥ 10 failed deploys (per §9.6; was 10 incidents pre-revision).
- src/app/[tenant]/dashboard/page.tsx: wires the new section in.
- src/components/integrations/datadog-connect-form.tsx + src/app/[tenant]/settings/integrations/[provider]/page.tsx: the §9.8 silent-decay hint, "last incident registered X days ago", on the detail page. Days are server-computed to keep the client component pure.
- platform/lib/translations.ts: full en + pt-br copy for the new surfaces.
- platform/tests/dora-aggregation.test.ts: 4 tests for computeDORA().

Docs:
- docs/integrations/datadog.md: customer setup guide. Covers the Application Key scope, regional sites, the connect flow, the cron schedule, what we read / don't read, repository matching, the disconnect behavior, and operational notes (backfill window, rate limits, encryption rotation).
- docs/METRICS.md: adds the two new dora_*_by_origin fields and the module-map row.

Verified:
- python -m pytest tests/ -q → 113 passed (4 new)
- platform: npx tsc --noEmit → clean
- platform: npx vitest run → 175 passed (4 new)
- platform: npx eslint → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(integrations): coverage_pct math + dead code + form gating on error (#15)

Three issues surfaced in the slice 5 audit:

1. `iris/analysis/dora_real.py`: the per-origin `coverage_pct` divided each origin's commits by `(this origin + ALL unknowns)`, so every origin's coverage dropped by the full unknown count. The right semantic is org-wide attribution coverage. Hoisted to a single result field `cfr_by_origin_coverage_pct` and removed from each per-origin dict.
2. Same file: `_referenced` was assigned and immediately popped from the dict; dead code, dropped.
3. `platform/src/components/integrations/datadog-connect-form.tsx`: the connected card only rendered when `status === "active"`, so an integration in `status: "error"` fell through to the connect form and lost the very surfaces (last_sync_at, last_error, unmatched count, days-since-last-incident) the operator needs to debug. Now renders the status card for both `active` and `error`, with the shield icon and copy switched to an error variant when the sync is failing.

Schema / TS / docs aligned:
- `iris/models/metrics.py` adds `dora_cfr_by_origin_coverage_pct`.
- `iris/metrics/aggregator.py` wires it.
- `platform/src/types/metrics.ts` drops `coverage_pct` from the per-origin shape and adds the new top-level field.
- `docs/METRICS.md` updates the field table and the explanatory blurb; module-map row picks up the new field.
- `platform/lib/translations.ts`: en + pt-br copy for the new error state.

Tests:
- `tests/test_dora_real.py`: old `coverage_pct` assertion replaced by two focused tests (mixed known/unknown drops org-wide coverage; full attribution reports 100%; no origin map → None).
- `platform/tests/dora-aggregation.test.ts`: adjusts mock payloads to drop the (now-removed) `coverage_pct` field on per-origin entries.

Verified: pytest 115 passed (16 dora_real tests), tsc clean, vitest 175 passed, eslint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(platform): restore build version in footer (#15)

The footer's `process.env.NEXT_PUBLIC_BUILD_VERSION || "dev"` lookup fell back to "dev" on every Vercel deploy because the env var was never wired up. Reads from `package.json` at config-load time and appends the Vercel commit SHA (`VERCEL_GIT_COMMIT_SHA`) when present so production / preview deploys carry a unique identifier between releases. Loaded via fs instead of an ESM JSON import to stay portable across Next's TS loader and direct Node ESM execution (the latter requires `with { type: "json" }`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): v1.0.6 - Datadog DORA integration (#15)

Bumps platform/package.json (1.0.0 → 1.0.6, catching up from the initial scaffold), pyproject.toml (1.0.5 → 1.0.6), and iris/cli.py:VERSION (v1.0.5 → v1.0.6). Adds the CHANGELOG entry for v1.0.6 covering the Datadog integration end-to-end across slices 1-5 (PRs #36, #37, #39, #40, #41, #42).

Highlights:
- Connect flow + encrypted credentials (slice 2)
- Daily Vercel Cron sync into external_deployments / _commits / _incidents (slice 3)
- Engine consumes events and emits 18 new dora_* fields including CFR / MTTR per-deploy / MTTR per-incident / rollback rate / lead time / deploy frequency / by-origin breakdowns (slice 4 + 5)
- Dashboard DORA section with the "Datadog" badge and the AI-vs-human correlation card (slice 5)
- Setup docs at docs/integrations/datadog.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
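The coverage fix described in the second commit of this squash is easier to see in code. The sketch below is written in TypeScript for consistency with the platform code, although the real logic lives in the Python engine (`iris/analysis/dora_real.py`). The types, the function name `cfrByOrigin`, and the choice to define per-origin CFR as the share of that origin's commits that shipped in a failed deploy are all assumptions made for illustration. The part taken from the commit message is that coverage is computed once, org-wide, as known commits over all commits seen on evaluated deploys.

```typescript
// Hypothetical shapes; the engine's real data model is not shown in this PR.
interface EvaluatedDeploy {
  changeFailure: boolean; // pending (NULL) deploys are assumed to be filtered out already
  commitShas: string[];
}

interface OriginBucket {
  commits: number;
  failedCommits: number;
  cfrPct: number;
}

interface CfrByOriginResult {
  byOrigin: Map<string, OriginBucket>;
  // Org-wide attribution coverage; null when it cannot be computed.
  coveragePct: number | null;
}

function cfrByOrigin(
  deploys: EvaluatedDeploy[],
  originMap: Map<string, string> | null,
): CfrByOriginResult {
  const byOrigin = new Map<string, OriginBucket>();
  if (originMap === null) return { byOrigin, coveragePct: null };

  let known = 0;
  let unknown = 0;
  for (const deploy of deploys) {
    for (const sha of deploy.commitShas) {
      const origin = originMap.get(sha);
      if (origin === undefined) {
        unknown++; // dropped from the buckets, but still counted for coverage
        continue;
      }
      known++;
      let bucket = byOrigin.get(origin);
      if (bucket === undefined) {
        bucket = { commits: 0, failedCommits: 0, cfrPct: 0 };
        byOrigin.set(origin, bucket);
      }
      bucket.commits++;
      if (deploy.changeFailure) bucket.failedCommits++;
    }
  }
  byOrigin.forEach((bucket) => {
    bucket.cfrPct = (bucket.failedCommits / bucket.commits) * 100;
  });

  // One denominator for the whole org, not (this origin + all unknowns).
  const total = known + unknown;
  return { byOrigin, coveragePct: total === 0 ? null : (known / total) * 100 };
}
```

With three attributed commits and one unattributed commit, this reports 75% coverage once, instead of penalizing every origin's figure by the same unknown count.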
Stacked on top of #39 (slice 2). Re-target to `main` after #39 merges, then rebase this branch onto main so reviewers see a clean diff.

Summary

Slice 3 of #15: ingest DORA events daily and persist them so slice 4/5 can read them.

- Migrations: `015_external_deployments`, `016_external_deployment_commits`, `017_external_incidents`. Schema follows §9.2 of `docs/PLAN-datadog.md` (tri-state `change_failure`, `recovery_time_sec`, `remediation_*`, `service` / `env` / `team` as `text[]` on incidents). UNIQUE `(provider, provider_event_id)` makes the upsert idempotent.
- Client: `listDeployments` / `listFailures` wrap `POST /api/v2/dora/{deployments,failures}` with the body shape from §8 (`data.type = "dora_*_list_request"`, ISO 8601 `from` / `to` with trailing `Z`).
- Sync (`platform/lib/integrations/datadog/sync.ts`), the per-org pipeline:
  - Cursor by `last_sync_at`; 30-day default backfill on first run.
  - Idempotent upsert by `(provider, provider_event_id)`; deployment commits upserted on the composite PK so reruns don't fail.
  - Repo matching via `normalizeRepoSlug`, which handles `git@host:org/repo.git`, `ssh://`, `https://`, `.git`, `www.`, and trailing slashes.
  - Flips `status` to `error` and writes `last_error` on failure; resets to `active` with a fresh `last_sync_at` on success.
- Cron: `vercel.json` registers `0 4 * * *` UTC against `/api/cron/sync-integrations`. Route is auth-gated by `CRON_SECRET` (`Authorization: Bearer …` from Vercel Cron, or `x-cron-secret` for manual dev triggers); in non-production with no secret, the route is open for `npm run dev` ergonomics.
- UI: the integration detail page surfaces the unmatched-deployments count alongside `last_sync_at` / `last_error`. Translation keys in en-US + pt-BR.
- Tests: `tests/datadog-sync.test.ts` covers slug normalization across all common remote-URL shapes, pagination-helper edge cases, and the DD zero-value timestamp guard.

Open decisions (default applied, redirect welcome)

- Cron schedule: `0 4 * * *` UTC (01:00 BRT), per §7 #2.

Out of scope (deferred to slice 4 / 5)

- Engine integration: `iris/analysis/dora_real.py` (CFR + MTTR computation from external_*)
- Dashboard surfacing (`Datadog` vs `Estimated`)

Test plan

- Rebase onto `main` and re-target to `main`
- `npx supabase migration up` applies 015/016/017 cleanly
- Set `CRON_SECRET` in Vercel (Preview + Production)
- Call `GET /api/cron/sync-integrations` with `x-cron-secret: …`; verify response shows `succeeded: 1, failed: 0` for the active integration
- Inspect `external_deployments`: every row should have a non-null `started_at`, and an idempotent rerun produces zero net changes
- Confirm `external_deployment_commits` is populated with `change_lead_time` / `time_to_deploy` per commit
- Confirm `external_incidents.service` is a text array and the GIN index is usable
- `npx vitest run tests/datadog-sync.test.ts`: 13 tests pass

🤖 Generated with Claude Code
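The cron route's auth gate described in the summary can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in `route.ts`: the function name `isCronAuthorized`, the lower-cased header bag, and the choice to reject every request in production when no secret is configured are hypothetical. Only the two accepted header forms and the open-in-development behavior come from the description above.

```typescript
// Hypothetical header shape: names are assumed to be lower-cased already.
type HeaderBag = Record<string, string | undefined>;

function isCronAuthorized(
  headers: HeaderBag,
  secret: string | undefined,
  isProduction: boolean,
): boolean {
  if (!secret) {
    // No secret configured: open in non-production for local runs,
    // closed in production (assumed; the PR does not state this case).
    return !isProduction;
  }
  // Vercel Cron sends the secret as a bearer token.
  if (headers["authorization"] === `Bearer ${secret}`) return true;
  // Manual triggers may pass the secret in a dedicated header.
  return headers["x-cron-secret"] === secret;
}
```

Keeping the check free of framework types makes it straightforward to unit-test. A production handler would likely prefer a constant-time comparison over `===` to avoid leaking timing information about the secret.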